ARMOR: Association Rule Mining based on ORacle
Abstract
In this paper, we first focus our attention on the question of how much space remains for performance improvement over current association rule mining algorithms. Our strategy is to compare their performance against an “Oracle algorithm” that knows in advance the identities of all frequent itemsets in the database and only needs to gather their actual supports to complete the mining process. Our experimental results show that current mining algorithms do not perform uniformly well with respect to the Oracle for all database characteristics and support thresholds. In many cases there is a substantial gap between the Oracle’s performance and that of the current mining algorithms. Second, we present a new mining algorithm, called ARMOR, that is constructed by making minimal changes to the Oracle algorithm. ARMOR consistently performs within a factor of two of the Oracle on both real and synthetic datasets over practical ranges of support specifications.
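The Oracle idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `oracle_count`, the in-memory list of transactions, and the naive subset test are all assumptions made for illustration. It only shows the core premise, that when the frequent itemsets are known in advance, a single pass over the database suffices to gather their supports.

```python
from typing import Dict, FrozenSet, Iterable, List


def oracle_count(
    transactions: Iterable[Iterable[str]],
    known_frequent: List[FrozenSet[str]],
) -> Dict[FrozenSet[str], int]:
    """Count the support of itemsets whose identities are given up front.

    Hypothetical helper: the itemsets in `known_frequent` are assumed to be
    exactly the frequent itemsets, as supplied by an oracle.
    """
    counts = {itemset: 0 for itemset in known_frequent}
    # Single scan over the database: each transaction is read exactly once.
    for transaction in transactions:
        items = set(transaction)
        for itemset in counts:
            # An itemset is supported by a transaction that contains all its items.
            if itemset <= items:
                counts[itemset] += 1
    return counts


# Toy database, invented for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
known = [frozenset({"bread"}), frozenset({"milk"}), frozenset({"bread", "milk"})]

supports = oracle_count(transactions, known)
for itemset, count in supports.items():
    print(sorted(itemset), count, count / len(transactions))
```

The inner loop tests every known itemset against every transaction, which is simple but far from optimal; a real Oracle would need a more careful counting structure, and nothing here should be read as describing how ARMOR itself works.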
Related resources
How Good Are Association-Rule Mining Algorithms?
We address the question of how much space remains for performance improvement over current association rule mining algorithms. Our approach is to compare their performance against an “Oracle algorithm” that knows in advance the identities of all frequent itemsets in the database and only needs to gather the actual supports of these itemsets, in one scan over the database, to complete the mining...
On the Optimality of Association-rule Mining Algorithms
Since its introduction close to a decade ago, the problem of efficient mining of association rules on market-basket data has attracted tremendous attention. Numerous algorithms have been proposed, each one in turn claiming to outperform its predecessors on a representative set of databases. In this paper, we first focus our attention on the question of how much space remains for performance imp...
On the Efficiency of Association-Rule Mining Algorithms
In this paper, we first focus our attention on the question of how much space remains for performance improvement over current association rule mining algorithms. Our strategy is to compare their performance against an “Oracle algorithm” that knows in advance the identities of all frequent itemsets in the database and only needs to gather their actual supports to complete the mining process. Ou...
Efficient Discovery of Concise Association Rules from Large Databases
Association rules are interesting correlations among attributes in a database. These rules have many applications in areas ranging from e-commerce to sports to census analysis to medical diagnosis. The discovery of association rules is an extremely computationally expensive task and it is therefore imperative to have fast scalable algorithms for mining these rules. In this thesis, we present effi...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...
Publication date: 2003